[clang] diagnose invalid std::tuple_size sizes #159677

mizvekov · 2025-09-18T23:32:58Z

This makes sure the tuple sizes remain within implementation limits, and this doesn't cause the compiler to crash later, as the tuple size is assumed to fit within an UnsignedOrNone.

Fixes #159563

llvmbot · 2025-09-18T23:33:33Z

@llvm/pr-subscribers-clang

Author: Matheus Izvekov (mizvekov)

Changes

Fixes #159563

Full diff: https://github.com/llvm/llvm-project/pull/159677.diff

4 Files Affected:

(modified) clang/docs/ReleaseNotes.rst (+4-2)
(modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+3)
(modified) clang/lib/Sema/SemaDeclCXX.cpp (+10)
(modified) clang/test/SemaCXX/builtin-structured-binding-size.cpp (+18-2)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index c898784b3f93e..ec14594761f3c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -262,6 +262,8 @@ Improvements to Clang's diagnostics
   Moved the warning for a missing (though implied) attribute on a redeclaration into this group.
   Added a new warning in this group for the case where the attribute is missing/implicit on
   an override of a virtual method.
+- Implemented diagnostics when retrieving the tuple size for types where its specialization of `std::tuple_size`
+  produces an invalid size (either negative or greater than the implementation limit). (#GH159563)
 - Fixed fix-it hint for fold expressions. Clang now correctly places the suggested right
   parenthesis when diagnosing malformed fold expressions. (#GH151787)
 - Added fix-it hint for when scoped enumerations require explicit conversions for binary operations. (#GH24265)
@@ -347,8 +349,8 @@ Bug Fixes in This Version
   and vector of 4 ``float`` values. (#GH155405)
 - Fixed inconsistent shadow warnings for lambda capture of structured bindings.
   Previously, ``[val = val]`` (regular parameter) produced no warnings with ``-Wshadow``
-  while ``[a = a]`` (where ``a`` is from ``auto [a, b] = std::make_pair(1, 2)``) 
-  incorrectly produced warnings. Both cases now consistently show no warnings with 
+  while ``[a = a]`` (where ``a`` is from ``auto [a, b] = std::make_pair(1, 2)``)
+  incorrectly produced warnings. Both cases now consistently show no warnings with
   ``-Wshadow`` and show uncaptured-local warnings with ``-Wshadow-all``. (#GH68605)
 - Fixed a failed assertion with a negative limit parameter value inside of
   ``__has_embed``. (#GH157842)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 8b862ae47af89..42608debe8609 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -638,6 +638,9 @@ def err_decomp_decl_std_tuple_element_not_specialized : Error<
 def err_decomp_decl_std_tuple_size_not_constant : Error<
   "cannot decompose this type; 'std::tuple_size<%0>::value' "
   "is not a valid integral constant expression">;
+def err_decomp_decl_std_tuple_size_invalid
+    : Error<"cannot decompose this type; 'std::tuple_size<%0>::value' "
+            "is not a valid size: %1">;
 def note_in_binding_decl_init : Note<
   "in implicit initialization of binding declaration %0">;
 def err_arg_is_not_destructurable : Error<
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index fb57b43882911..e133022426cff 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -1222,6 +1222,16 @@ static IsTupleLike isTupleLike(Sema &S, SourceLocation Loc, QualType T,
   if (E.isInvalid())
     return IsTupleLike::Error;
 
+  if (Size < 0 || Size >= UINT_MAX) {
+    llvm::SmallVector<char, 16> Str;
+    Size.toString(Str);
+    S.Diag(Loc, diag::err_decomp_decl_std_tuple_size_invalid)
+        << printTemplateArgs(S.Context.getPrintingPolicy(), Args,
+                             /*Params=*/nullptr)
+        << StringRef(Str.data(), Str.size());
+    return IsTupleLike::Error;
+  }
+
   return IsTupleLike::TupleLike;
 }
 
diff --git a/clang/test/SemaCXX/builtin-structured-binding-size.cpp b/clang/test/SemaCXX/builtin-structured-binding-size.cpp
index 53576048754ab..bcd13a6fb720d 100644
--- a/clang/test/SemaCXX/builtin-structured-binding-size.cpp
+++ b/clang/test/SemaCXX/builtin-structured-binding-size.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 %s -std=c++2c -fsyntax-only -verify
-// RUN: %clang_cc1 %s -std=c++2c -fsyntax-only -verify -fexperimental-new-constant-interpreter
+// RUN: %clang_cc1 %s -triple=x86_64 -std=c++2c -fsyntax-only -verify
+// RUN: %clang_cc1 %s -triple=x86_64 -std=c++2c -fsyntax-only -verify -fexperimental-new-constant-interpreter
 
 
 struct S0 {};
@@ -229,3 +229,19 @@ static_assert(__is_same_as(tag_of_t<S1>, int));
 static_assert(__is_same_as(tag_of_t<int>, int)); // error
 // expected-error@-1 {{constraints not satisfied for alias template 'tag_of_t' [with T = int]}}
 // expected-note@#tag-of-constr {{because substituted constraint expression is ill-formed: type 'int' cannot be decomposed}}
+
+struct MinusOne;
+template <> struct ::std::tuple_size<MinusOne> {
+  static constexpr int value = -1;
+};
+int minus_one = __builtin_structured_binding_size(MinusOne);
+// expected-error@-1 {{cannot decompose this type; 'std::tuple_size<MinusOne>::value' is not a valid size: -1}}
+// expected-error@-2 {{type 'MinusOne' cannot be decomposed}}
+
+struct UintMax;
+template <> struct ::std::tuple_size<UintMax> {
+  static constexpr unsigned value = -1;
+};
+int uint_max = __builtin_structured_binding_size(UintMax);
+// expected-error@-1 {{cannot decompose this type; 'std::tuple_size<UintMax>::value' is not a valid size: 4294967295}}
+// expected-error@-2 {{type 'UintMax' cannot be decomposed}}

shafik

Can you please add a more complete description of the actual underlying problem something along the lines of what I added here: #159579 (comment)

It took me and others a bit to get the original problem and so the description is valuable.

Also this is important documentation for the git log and future devs.

tbaederr · 2025-09-19T06:17:17Z

clang/lib/Sema/SemaDeclCXX.cpp


+  if (Size < 0 || Size >= UINT_MAX) {
+    llvm::SmallVector<char, 16> Str;
+    Size.toString(Str);


If we don't format the value and it's lower than UINT_MAX anyway, just calling getExtValue() should work?

I am not sure what you mean, here the value can be arbitrarily larger than UINT_MAX, so calling getExtValue can truncate them.

tbaederr · 2025-09-19T06:18:26Z

clang/lib/Sema/SemaDeclCXX.cpp

  if (E.isInvalid())
    return IsTupleLike::Error;

+  if (Size < 0 || Size >= UINT_MAX) {


Why is this UINT_MAX but the calling code uses Ctx.getSizeType()?

This test doesn't seem all that right. I think we should be checking if (Size.isNegative() || !Size.isSignedIntN(Ctx.getTypeSize(Ctx.getSizeType())) or something like that like that ?

(should that just be isIntN?).

host-compiler UINT_MAX is almost definitely the wrong thing here.

The host compiler uintmax is the right thing, because we pass this down on an UnsignedOrNone, which uses unsigned as the representation, so the implementation limit is uintmax-1.

Oh! sure, yeah, that is reasonable.

cor3ntin · 2025-09-19T07:58:28Z

@tbaederr @mizvekov Is the plan for Tim's patch to be backported and for this one to be in 22? This would make sense.
I reported a core issue here https://cplusplus.github.io/CWG/issues/3071.html

erichkeane · 2025-09-19T13:32:14Z

clang/lib/Sema/SemaDeclCXX.cpp


+  if (Size < 0 || Size >= UINT_MAX) {
+    llvm::SmallVector<char, 16> Str;
+    Size.toString(Str);


Why the separately formatting? ADT/APInt.h already ahs an operator <<, so we should be able to just << Size.

This isn't a raw_ostream though. We usually pass toString(Size, 10) or similar.

Ah, hrmph... It seems to me that StreamingDiagnostic should have something that defers to raw_ostream but I see we don't have one. That is unfortunate.

erichkeane · 2025-09-19T13:37:09Z

clang/lib/Sema/SemaDeclCXX.cpp

  if (E.isInvalid())
    return IsTupleLike::Error;

+  if (Size < 0 || Size >= UINT_MAX) {


This test doesn't seem all that right. I think we should be checking if (Size.isNegative() || !Size.isSignedIntN(Ctx.getTypeSize(Ctx.getSizeType())) or something like that like that ?

(should that just be isIntN?).

host-compiler UINT_MAX is almost definitely the wrong thing here.

mizvekov · 2025-09-19T17:38:52Z

@tbaederr @mizvekov Is the plan for Tim's patch to be backported and for this one to be in 22? This would make sense. I reported a core issue here https://cplusplus.github.io/CWG/issues/3071.html

I mean do we need to use the other patch at all? Why not just backport this one?

Note that besides not producing a diagnostic with an explanation for the negative case, it also allows values larger than the implementation limit, which will cause wrap around in the compiler, potentially leading to compiler UB anyway.

zyn0217 · 2025-09-19T18:18:37Z

I mean do we need to use the other patch at all? Why not just backport this one?

I remembered adding a new diagnostic in a backport patch could cause an ABI break, since we can somehow change the enumeration order in the tablegen files.

For example, #139396 (review)

Fixes #159563

erichkeane · 2025-09-19T18:22:55Z

I mean do we need to use the other patch at all? Why not just backport this one?

I remembered adding a new diagnostic in a backport patch could cause an ABI break, since we can somehow change the enumeration order in the tablegen files.

For example, #139396 (review)

Yeah, we can't backport new diagnostics unfortunately.

mizvekov requested review from AaronBallman, cor3ntin, erichkeane and zyn0217 September 18, 2025 23:32

mizvekov self-assigned this Sep 18, 2025

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 18, 2025

mizvekov requested a review from shafik September 18, 2025 23:33

zyn0217 approved these changes Sep 19, 2025

View reviewed changes

zwuis requested a review from tbaederr September 19, 2025 00:45

mizvekov mentioned this pull request Sep 19, 2025

[clang][Sema] Reject negative tuple sizes #159579

Closed

shafik reviewed Sep 19, 2025

View reviewed changes

tbaederr reviewed Sep 19, 2025

View reviewed changes

erichkeane reviewed Sep 19, 2025

View reviewed changes

mizvekov force-pushed the users/mizvekov/GH159563 branch from 94efaa9 to 2b8e0e6 Compare September 19, 2025 17:54

[clang] diagnose invalid std::tuple_size sizes

97daa1e

Fixes #159563

mizvekov force-pushed the users/mizvekov/GH159563 branch from 2b8e0e6 to 97daa1e Compare September 19, 2025 18:21

erichkeane approved these changes Sep 19, 2025

View reviewed changes

mizvekov merged commit 46ce6a0 into main Sep 19, 2025
8 of 10 checks passed

mizvekov deleted the users/mizvekov/GH159563 branch September 19, 2025 19:18

[clang] diagnose invalid std::tuple_size sizes #159677

[clang] diagnose invalid std::tuple_size sizes #159677

Conversation

mizvekov commented Sep 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 18, 2025

Uh oh!

shafik left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cor3ntin commented Sep 19, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mizvekov commented Sep 19, 2025

Uh oh!

zyn0217 commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

erichkeane commented Sep 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

mizvekov commented Sep 18, 2025 •

edited

Loading

zyn0217 commented Sep 19, 2025 •

edited

Loading